Combining protein secondary structure prediction models with ensemble methods of optimal complexity

Authors

  • Yann Guermeur
  • Gianluca Pollastri
  • André Elisseeff
  • Dominique Zelus
  • Hélène Paugam-Moisy
  • Pierre Baldi
Abstract

Many sophisticated methods are currently available to perform protein secondary structure prediction. Since they are frequently based on different principles, and different knowledge sources, significant benefits can be expected from combining them. However, the choice of an appropriate combiner appears to be an issue in its own right. The first difficulty to overcome when combining prediction methods is overfitting. This is the reason why we investigate the implementation of Support Vector Machines to perform the task. A family of multi-class SVMs is introduced. Two of these machines are used to combine some of the current best protein secondary structure prediction methods. Their performance is consistently superior to the performance of the ensemble methods traditionally used in the field. They also outperform the decomposition approaches based on bi-class SVMs. Furthermore, initial experimental evidence suggests that their outputs could be processed by the biologist to perform higher-level treatments. © 2003 Elsevier B.V. All rights reserved.


Similar resources

Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

A DNA sequence, although it contains all genetic traits, is not a functional entity. Instead, it is converted into protein sequences through transcription and translation. The protein sequence later takes on a 3D structure, which is the functional unit and manages biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...


Combining Statistical Models for Protein Secondary Structure Prediction

We investigate the problem of combining experts to predict the secondary structure of globular proteins. We first present two different statistical models for this task. We then analyse an efficient linear combination technique, which sheds light on unexplained phenomena frequently encountered in practice for ensemble methods.


Electricity Load Forecasting by Combining Adaptive Neuro-fuzzy Inference System and Seasonal Auto-Regressive Integrated Moving Average

Nowadays, electricity load forecasting, as one of the most important areas, plays a crucial role in the economic process. What separates electricity from other commodities is the impossibility of storing it on a large scale and cost-effective construction of new power generation and distribution plants. Also, the existence of seasonality, nonlinear complexity, and ambiguity pattern in electrici...


A General Method for Combining Predictors Tested on Protein Secondary Structure Prediction

Ensemble methods, which combine several classifiers, have been successfully applied to decrease generalization error of machine learning methods. For most ensemble methods the ensemble members are combined by weighted summation of the output, called the linear average predictor. The logarithmic opinion pool ensemble method uses a multiplicative combination of the ensemble members, which treats ...
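
The two combination rules named in this excerpt can be illustrated with a minimal sketch. Because the excerpt is truncated, the code follows the textbook definitions (a weighted sum of the members' class probabilities, and a normalized weighted geometric mean for the logarithmic opinion pool) rather than that paper's exact formulation. The function names, the three-state labels, the member outputs, and the uniform weights are all hypothetical.

```python
import math

# Assumed three-state labelling: helix, strand, coil.
CLASSES = ("H", "E", "C")


def linear_average(predictions, weights):
    """Linear average predictor: weighted sum of class-probability vectors."""
    n_classes = len(predictions[0])
    return [
        sum(w * p[k] for w, p in zip(weights, predictions))
        for k in range(n_classes)
    ]


def log_opinion_pool(predictions, weights, eps=1e-12):
    """Logarithmic opinion pool: normalized weighted geometric mean."""
    n_classes = len(predictions[0])
    # Work in log space; eps guards against log(0).
    log_scores = [
        sum(w * math.log(max(p[k], eps)) for w, p in zip(weights, predictions))
        for k in range(n_classes)
    ]
    top = max(log_scores)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(s - top) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical outputs of three predictors for a single residue.
members = [
    [0.70, 0.20, 0.10],
    [0.60, 0.30, 0.10],
    [0.05, 0.90, 0.05],
]
weights = [1 / 3, 1 / 3, 1 / 3]  # uniform weights, chosen for illustration

linear = linear_average(members, weights)
pooled = log_opinion_pool(members, weights)

for name, dist in (("linear average", linear), ("log opinion pool", pooled)):
    best = CLASSES[dist.index(max(dist))]
    print(name, [round(x, 3) for x in dist], "->", best)
```

Both rules return a distribution over the classes. The multiplicative rule strongly suppresses any class that a single member rates near zero, whereas the additive rule lets the other members compensate.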


Profiles and Majority Voting-Based Ensemble Method for Protein Secondary Structure Prediction

Machine learning techniques have been widely applied to solve the problem of predicting protein secondary structure from the amino acid sequence. They have gained substantial success in this research area. Many methods have been used including k-Nearest Neighbors (k-NNs), Hidden Markov Models (HMMs), Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs), which have attracted atte...



Journal:
  • Neurocomputing

Volume 56

Publication date: 2004